A database of high-pressure crystal structures from hydrogen to lanthanum

This paper introduces the HEX (High-pressure Elemental Xstals) database, a complete database of the ground-state crystal structures of the first 57 elements of the periodic table, from H to La, at 0, 100, 200 and 300 GPa. HEX aims to provide a unified reference for high-pressure research, by compiling all available experimental information on elements at high pressure, and complementing it with the results of accurate evolutionary crystal structure prediction runs based on Density Functional Theory. Besides offering a much-needed reference, our work also serves as a benchmark of the accuracy of current ab-initio methods for crystal structure prediction. We find that, in 98% of the cases in which experimental information is available, ab-initio crystal structure prediction yields structures which either coincide or are degenerate in enthalpy to within 300 K with experimental ones. The main manuscript contains synthetic tables and figures, while the Crystallographic Information File (cif) for all structures can be downloaded from the related figshare online repository.


Background & Summary
The advent of the 21st century marked a pivotal moment for high-pressure research: advancements in diamond anvil cell design and in-situ characterization techniques [1][2][3][4] gave access to the realm of multi-megabar pressures, revealing unexpected and fascinating phenomena, such as high-temperature conventional superconductivity in H3S 5,6 , LaH10 7,8 and other superhydrides 2 , metal-insulator transition in elemental sodium 9 , self-ionization of boron 10 , electride behavior in alkali metals 11 , noble-gas solids 12 , etc.
Until the turn of the century, knowledge of the behaviour of matter at high pressure was limited and based on indirect evidence. The general expectation was that all matter would tend to become homogeneous and metallic to maximize the electronic kinetic energy. However, experiments over the last 30 years have revealed a much more varied behaviour, defying this naïve expectation. Compounds at high pressures often adopt exotic crystal structures, whose stoichiometries, motifs and moieties defy fundamental chemical concepts, such as valence and electronegativity, which govern the behaviour of matter at ambient pressure 13 . The most striking examples of this so-called forbidden chemistry are highlighted in several excellent review papers [14][15][16][17][18] , which also offer a glimpse of the underlying physical mechanisms, such as polymerization, rearrangements of atomic orbital energies, interstitial charge localization, etc.
Ab-initio calculations based on Density Functional Theory (DFT) have played a pivotal role in high-pressure research. Nowadays, these methods make it possible not only to describe known phases from a microscopic quantum-mechanical viewpoint, but also to predict new structures and properties. The famous Maddox paradox, according to which quantum mechanical methods for material modelling cannot be considered fully predictive unless they can predict crystal structures from the knowledge of the chemical composition alone, has finally been overcome 19 . In fact, modern techniques for crystal structure prediction have proven their predictive power over a variety of systems, with an astounding agreement with experimental observations 20 . These techniques utilize clever optimization strategies to identify the global and local minima of the potential energy surface (PES) associated with a given set of atoms, which correspond to the ground-state and metastable structures 21 , respectively. Commonly-employed methods include simulated annealing, ab-initio random structure search 22 , metadynamics 23 , minima hopping 24,25 , evolutionary algorithms [26][27][28] , particle swarm optimization 29,30 , etc.
Indeed, thanks to the increasing integration between experimental and computational methodologies, the knowledge of high-pressure crystal structures has advanced significantly in recent years. However, the relevant information is still largely incomplete, and spread over several databases and publications, whose standards vary significantly. A large portion of these sources is either inaccessible due to paywalls or relies on outdated conventions. Even for the most basic systems, such as mono-elemental solids, it is frequently challenging to find complete crystal structure information for the entire range of experimentally-accessible pressures, which nowadays exceeds 400 GPa. In fact, particularly at higher pressures, the only available crystal structure information derives from computational predictions, which differ significantly in terms of breadth and accuracy.

Table 1. Computational details of the multi-step DFT relaxation procedure employed in evolutionary crystal structure prediction runs; ENMIN/ENMAX indicate the minimum/maximum kinetic energy cutoff values reported in the VASP pseudopotential files; ΔE and ΔF indicate the total energy and force convergence criteria, respectively. The final row of the table contains the settings used for the final relaxation of the EA-generated structures as well as literature structures.
The aim of the High-pressure Elemental Xstals database 31 (HEX database) is to provide a single open-access, easily accessible and well-organized database containing the crystal structures of the first 57 elements of the periodic table (Hydrogen-Lanthanum) at pressures of 0, 100, 200 and 300 GPa. The database has been constructed by compiling all available literature, and comparing it with the results of highly-accurate evolutionary crystal structure prediction calculations 26 , based on plane-wave pseudopotential Density Functional Theory (DFT) total energies. Our choice to exclude elements beyond lanthanum is motivated by the need to maintain a consistent accuracy throughout the database: elements in the lanthanide and actinide series have been excluded due to the inadequacy of the pseudopotential approximation for elements with open f-shells, while other heavy elements were discarded because significant spin-orbit interaction may introduce further sources of inaccuracy in the calculations. To keep the computational cost manageable, our evolutionary crystal structure prediction runs employ 8-atom unit cells, and neglect zero-point energy (ZPE) corrections, which should however be negligible for elements beyond the first rows.
The primary aim of this work is to provide a complete and accurate reference for researchers in various fields. Moreover, by presenting a systematic comparison of high-quality crystal structure prediction results with literature data, the HEX database 31 also gives an extensive benchmark of the accuracy of crystal structure prediction methods on elemental crystal structures, which nicely complements existing blind tests on molecules 32 . We find that evolutionary algorithm (EA) predictions reproduce known experimental results in over 95% of the cases; most of the observed deviations can be attributed to the use of unit cells that are too small.

Methods
Data contained in the HEX database 31 were generated by combining literature data with results of evolutionary crystal structure prediction runs.
We performed a thorough screening of the available literature to identify the ground-state crystal structures of the first 57 elements of the periodic table (H-La), at 0, 100, 200 and 300 GPa. Moreover, we performed unconstrained ab-initio EA searches for each element and pressure, as explained below. The structures obtained from the two sources underwent a final relaxation and symmetrization employing the same convergence criteria. This allowed us to compare the total energies/enthalpies to determine a single ground-state crystal structure for each element and pressure; taken all together, these structures form the first sub-database (Database Ground-State). We also created two other sub-databases, one containing all structures predicted by EA runs (Database Evolutionary Algorithm), and the other containing all literature (LIT) structures which turned out to be less energetically favorable than the EA ones (Database Mismatch). The content and structure of the three sub-databases are described in detail in the Data Records section; here we describe the generation procedure.
• EA-generated structures: The bulk of our work involved crystal structure prediction runs for the first 57 elements of the periodic table (H-La), over a wide range of pressures. We employed evolutionary algorithms as implemented in the Universal Structure Predictor: Evolutionary Xtallography (USPEX) code [33][34][35] . Structural searches for each element were carried out at 0, 100, 200, 300 GPa to identify the lowest-enthalpy structure.
The underlying structural relaxations and total energy calculations are based on Density Functional Theory (DFT), as implemented in the Vienna Ab Initio Simulation Package (VASP) 36,37 . We employed Projector Augmented Wave pseudopotentials 38,39 and a standard GGA exchange-correlation functional 40 . For reciprocal k-space integration we used uniform Monkhorst-Pack grids 41 with Methfessel-Paxton smearing 42 (see Table 1 for further details).
For each combination of element and pressure, we performed EA searches with an 8-atom unit cell. The first generation contained 40 structures (individuals), while each of the following generations contained 20 structures. Each individual was fully relaxed, following a five-step relaxation procedure with increasing accuracy; the relevant parameters are summarized in Table 1. Crystal structure prediction runs lasted for a maximum of 20 generations, and were considered converged when the lowest-enthalpy structure remained the same for 7 consecutive generations. Once the evolutionary algorithm search was converged, we collected the ten lowest-enthalpy structures for each element and pressure. These structures underwent a final relaxation, with the tighter criteria listed in the final row of Table 1, and were finally symmetrized using the FINDSYM algorithm by Stokes et al. 43,44 , with a tolerance criterion of 0.2 Å. The lowest-enthalpy structure after symmetrization for each element and pressure was selected as the EA ground-state structure. If, after the final relaxation and symmetrization, we found more than one structure to be degenerate in enthalpy within 26 meV (i.e. k_B T for T = 300 K), we selected the highest-symmetry one.
• Literature search: We performed a thorough screening of the existing literature on the crystal structures of the first 57 elements of the periodic table (H-La), at 0, 100, 200 and 300 GPa. We chose experimental references rather than theoretical ones, when available, and preferred more recent papers over older ones. Our bibliographic search was performed as comprehensively as possible using multiple queries and strategies. However, we cannot rule out that we may have missed some references.

The experimental structures at ambient pressure were extracted from the American Mineralogist Crystal Structure Database 45 , while information on higher pressure was obtained from multiple sources.
All references, along with an indication of whether they refer to a theoretical or experimental work, are reported in the Ref column of Tables 2-10. Once identified, structures extracted from literature underwent a single run of structural relaxation, with the same settings used for the final relaxation of EA-generated structures, before their energies were compared with the EA results. The parameters reported in the tables refer to this final relaxation.
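The stopping rule and the final selection step of the EA procedure described above can be sketched as follows. This is only an illustration under stated assumptions, not the USPEX implementation: the `Candidate` record, the function names, and the use of the number of symmetry operations to rank "highest symmetry" are hypothetical choices made for the example. The last line checks that the 26 meV window is indeed close to k_B T at 300 K.

```python
from dataclasses import dataclass

K_B_EV_PER_K = 8.617333262e-5   # Boltzmann constant, eV/K
DEGENERACY_MEV = 26.0           # window quoted in the text (about k_B T at 300 K)


@dataclass
class Candidate:
    label: str            # hypothetical identifier for a relaxed structure
    enthalpy_mev: float   # enthalpy per atom in meV
    n_symmetry_ops: int   # assumed ranking criterion for "highest symmetry"


def should_stop(best_labels, patience=7, max_generations=20):
    """Stop when the best structure is unchanged for `patience` generations,
    or when the generation cap is reached."""
    if len(best_labels) >= max_generations:
        return True
    if len(best_labels) < patience:
        return False
    recent = best_labels[-patience:]
    return all(label == recent[0] for label in recent)


def select_ground_state(candidates, window_mev=DEGENERACY_MEV):
    """Among structures within `window_mev` of the lowest enthalpy,
    return the one with the most symmetry operations."""
    h_min = min(c.enthalpy_mev for c in candidates)
    degenerate = [c for c in candidates if c.enthalpy_mev - h_min < window_mev]
    return max(degenerate, key=lambda c: (c.n_symmetry_ops, -c.enthalpy_mev))


# k_B T at 300 K, expressed in meV (roughly 25.9 meV)
thermal_window_mev = K_B_EV_PER_K * 300 * 1000
```

Here `best_labels` is assumed to hold one identifier of the lowest-enthalpy structure per generation, so the convergence test reduces to comparing the last seven entries.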

Data records
Our HEX database 31 comprises the three sub-databases described below. Details of the relevant structures are reported in Tables 2-10; the corresponding CIF files can be found at figshare https://doi.org/10.6084/m9.figshare.c.7119778.v1.

• DB_GS (Database Ground-State):
The main sub-database includes the ground-state structure for each element at 0, 100, 200, 300 GPa, obtained by comparing the result of our evolutionary crystal structure prediction runs (EA structures) with the structures obtained from the screening of the literature (LIT structures), when available. The columns of Tables 2-5 contain the atomic number Z, element symbol, space group, unit cell volume (per atom), and the Wyckoff positions of the ground-state structures; the column Source specifies whether the lowest-energy structure was found through EA runs (ea), or in literature (lit); an asterisk (*) indicates that the EA-generated structure agrees with the literature, while a dash (−) indicates that we could not find a literature reference for the relevant element and pressure (unreported structures). In cases where the difference in enthalpy between the EA-generated structure and the literature one was below 26 meV/atom, the structures were considered to be degenerate. We indicate in bold-face the entries for which the EA predictions are unsuccessful, i.e. cases in which the EA-predicted structures are neither matching nor degenerate with available experimental data.
• DB_MISS (Database Miss): This database contains the list of literature structures less stable than EA-generated ones, and hence not included in the ground-state tables. The structures for all pressures are grouped into a single table, Table 10. The columns contain the atomic number Z, the element symbol, the space group when available, the unit cell volume (per atom), the Wyckoff positions, the enthalpy relative to the ground-state, and the literature reference, together with an indication of whether the literature reference is theoretical or experimental. We also indicate explicitly (N/A) when literature references did not report enough structural information to allow for a comparison with EA-generated structures (non-reproducible in the following).
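The comparison rules that decide how an element/pressure entry is labelled can be summarized in a short sketch. It is a hedged reading of the text rather than the authors' code: the function name, the returned dictionary keys, the use of the absolute enthalpy difference for the degeneracy test, and the tie-breaking toward `ea` are assumptions introduced for illustration.

```python
from typing import Optional

DEGENERACY_MEV_PER_ATOM = 26.0  # threshold quoted in the text


def compare_with_literature(h_ea: float, h_lit: Optional[float],
                            same_structure: bool = False) -> dict:
    """Classify one element/pressure entry (enthalpies in meV/atom)."""
    if h_lit is None:
        # No literature structure: the EA result is the only candidate.
        return {"source": "-", "outcome": "unreported", "to_miss": False}
    if same_structure:
        # EA and literature agree on the structure.
        return {"source": "*", "outcome": "matching", "to_miss": False}
    delta = h_ea - h_lit  # negative when the EA structure is more stable
    outcome = "degenerate" if abs(delta) < DEGENERACY_MEV_PER_ATOM else "mismatching"
    source = "ea" if delta <= 0 else "lit"
    # Literature structures less stable than the EA one go to DB_MISS.
    return {"source": source, "outcome": outcome, "to_miss": delta < 0}
```

For example, `compare_with_literature(0.0, 50.0)` describes a literature structure lying 50 meV/atom above the EA one: the ground state comes from the EA run and the literature structure would be listed in DB_MISS.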
In Fig. 1 the trends in the evolution of the crystal structure of the elements with pressure are summarized in graphical form. The four periodic tables indicate, for each element, the crystal system of the ground-state crystal structure at pressures of 0, 100, 200 and 300 GPa, identified by space-group number: Monoclinic (3-15), Orthorhombic (16-74), Tetragonal (75-142), Trigonal (143-167), Hexagonal (168-194), and Cubic (195-230). Bravais lattice types are indicated by a color scale, from purple to yellow.
The figure shows that for most elements the evolution of the crystal structure with pressure does not follow the naïve expectation that all matter should become more homogeneous under pressure by adopting more close-packed structures. In fact, except for transition metals and noble gases, which adopt either face- or body-centered cubic or hexagonal close-packed structures over the whole range of pressures, other elements undergo a series of transitions, sometimes leading to very complex structures, which may exhibit lower symmetries than ambient-pressure ones. The observed deviation from hard-sphere close-packing at high pressure can originate from different physical mechanisms: charge localization in interstitial sites (electride behavior), in alkali and alkaline-earth metals; stabilization of polymeric or molecular phases, in pnictogens, chalcogens and halogens; repopulation of atomic orbitals, leading to a change in formal valence, as in III and IV-row elements 16,18 .
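The space-group ranges used in Fig. 1 follow the standard numbering of the 230 space groups, so the crystal system can be recovered from the space-group number alone. The sketch below illustrates this lookup; the function name is hypothetical, and the triclinic range (1-2), which is not listed in the text, is added only for completeness.

```python
from bisect import bisect_left

# Upper bound (inclusive) of each crystal system in the standard numbering.
_UPPER_BOUNDS = [2, 15, 74, 142, 167, 194, 230]
_SYSTEMS = ["triclinic", "monoclinic", "orthorhombic", "tetragonal",
            "trigonal", "hexagonal", "cubic"]


def crystal_system(space_group_number: int) -> str:
    """Map a space-group number (1-230) to its crystal system."""
    if not 1 <= space_group_number <= 230:
        raise ValueError(f"invalid space-group number: {space_group_number}")
    return _SYSTEMS[bisect_left(_UPPER_BOUNDS, space_group_number)]
```

For instance, the fcc (225) and bcc (229) structures map to cubic, while hcp (194) maps to hexagonal.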

Technical Validation
Validation is an intrinsic part of our work, which comprised a thorough comparison of the results of extensive evolutionary algorithm searches, sampling over 70,000 structures, with available literature data.
Figure 2 summarizes the current status of knowledge of high-pressure (HP) structures and presents a comparison with EA-generated structures. The bar chart indicates for each pressure the amount of information available in literature on the structures of the first 57 elements. Structures for each element are divided into Unreported, Theory, and Experiment, depending on whether any information is available in literature, and on whether the source is an experiment or a theoretical prediction. The column Total is the sum of Theory and Experiment. The bars are colored to indicate whether our EA-prediction runs were successful or unsuccessful in reproducing literature data. A successful prediction implies that the EA-predicted structure either exactly matches the literature structure or is degenerate with it to within 300 K (26 meV). Cases in which literature information did not contain enough data to fully reproduce the structures are indicated as Non-reproducible.
While at ambient pressure the structures of all these elements have been experimentally determined and are collected in the American Mineralogist Crystal Structure Database 45 , as pressure increases fewer and fewer experimental reports of high-pressure elemental phases can be found. For example, at 300 GPa, experimental information is available for only about 15% of the 57 elements considered in this work; about twice as many structures can be recovered from theoretical predictions, but for more than 50% of the elements the structure is unreported.
In general, there is a remarkable agreement between our EA predictions and experiment. Moreover, we find that for most cases where we could not identify any literature reference, our EA calculations predict that the elements will retain the same crystal structure measured at lower pressures. In the rare cases in which we observe a disagreement between EA predictions and experiments, it is easy to find very plausible explanations a posteriori; these are discussed in the following.
In the right panel of Fig. 2 we use a pie chart to quantify the success rate of EA predictions. The comparison in this case involves only cases for which full experimental information is available. On average, we find that ~98% of the EA predictions were successful, i.e. EA either predicted the same structure as experiment (matching structures), or a structure degenerate with it to within 26 meV.
• Ambient pressure (0 GPa): Of the 57 literature reports found for 0 GPa, 36 match our EA predictions. Of the remaining 21 mismatching cases, 17 are degenerate in enthalpy. This means that 53 structures can be labeled as successful.
In a few cases, the original mismatch between EA predictions and experiment was eliminated by including corrections to the standard GGA functional used for all our calculations. In particular, for Br and I, marked with daggers in the tables, the experimental ground-state structures become degenerate in enthalpy with our calculated structures after adding van der Waals corrections. While experimentally these elements form molecular crystals, the structures we predict contain zig-zag polymeric chains. Since the two types of structures are almost degenerate in energy, it is conceivable that, depending on the activation energy and temperature dependence of the polymerization, polymeric structures might also be experimentally realizable.
In order for the EA predictions to match experiments for Fe, Co and Ni, we had to include spin polarization in the calculations. These entries are marked with asterisks in the table.
Of the four unsuccessful structures, B, S and Mn have a ground-state characterized by cells much larger than the 8-atom cell we considered for our EA searches, while for tellurium, we believe that the source of the discrepancy may be a substantial role of spin-orbit effects, which are neglected in our calculations.

Fig. 3 EA-predicted crystal structures for elements and pressures where the experimental information is either completely missing, or too incomplete to reconstruct the structure. We leave out trivial cases in which the structure is a monoatomic fcc, bcc or hcp one. Structures are labelled as: Element-Space Group number and pressure.
In summary, at 0 GPa, 93% of the EA predictions can be considered successful, according to our criteria.
• 100 GPa Group: Of the 44 structures reported in literature for 100 GPa, our EA predictions are matching for 34 elements. Of the remaining 10 mismatching cases, 8 are degenerate in enthalpy. For the remaining two elements, S and In, literature references did not contain enough information to fully reconstruct the structures, only the Bravais lattices: bco for S 46 and bct for In 47 . Hence, they should be classified as unreported. At 100 GPa, our EA structures are successful in reproducing the literature data in 100% of the cases where complete experimental information was available.
• 200 GPa Group: Of the 41 structures reported in literature for 200 GPa, our EA predictions are matching in 34 cases. Of the remaining 7 mismatching cases, 4 are degenerate. We have not been able to gather enough information to perform calculations on the reported phases for N and Sc, which should then be classified as unreported.
The EA-predicted structure for Ni (fcc) is more stable than the bcc phase predicted by Belashenko et al. 48 . Including spin-polarization in the calculation does not modify this result. It is likely that strong correlation effects may resolve the discrepancy 49 . At 200 GPa, taking into consideration only fully experimentally-determined structures, successful predictions are hence 100% of the total.
• 300 GPa Group: Our EA predictions match 18 of the 25 structures reported in literature for 300 GPa. Of the 7 mismatching cases, 4 are degenerate in enthalpy. Of the remaining elements, the reference reported for N did not contain enough information to fully determine the crystal structure 50 , and should then be considered unreported. For Li 51 and Y 52 , the structures we obtained were found to be less stable than theoretical predictions in literature, which however employed much larger unit cells.
In summary, at 300 GPa, our EA structures reproduced literature results in 92% of the cases.Taking into consideration only fully-determined experimental structures, the fraction of successful predictions rises to 100%.
An exciting outcome of our work is that evolutionary crystal structure predictions based on Density Functional Theory are extremely accurate: on average 96% of structures available in literature were predicted correctly (98% considering only fully-determined experimental structures). In all but two cases where EA-predicted structures could not reproduce the ground-state structures from the literature, we could attribute this either to physical effects not included in our original computational setup (vdW interactions, magnetism, spin-orbit coupling) or to the choice of too small a unit cell. The only two cases for which we could not find a simple explanation are Te at ambient pressure, and Ni at 200 GPa.
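The success rates quoted above follow from the counts given for each pressure group. The sketch below redoes this arithmetic; excluding non-reproducible entries from the denominator and pooling all pressures to obtain the overall figure are assumptions about how the percentages were obtained, and the variable names are illustrative.

```python
# (reported, matching, degenerate, non_reproducible), as quoted in the text
COUNTS = {
    0: (57, 36, 17, 0),
    100: (44, 34, 8, 2),
    200: (41, 34, 4, 2),
    300: (25, 18, 4, 1),
}


def success_rate(reported, matching, degenerate, non_reproducible):
    """Fraction of comparable literature structures that are matched
    or degenerate with the EA prediction."""
    comparable = reported - non_reproducible
    return (matching + degenerate) / comparable


rates = {pressure: success_rate(*counts) for pressure, counts in COUNTS.items()}

# Pooled rate over all pressures.
total_success = sum(m + d for _, m, d, _ in COUNTS.values())
total_comparable = sum(r - n for r, _, _, n in COUNTS.values())
overall = total_success / total_comparable

for pressure, rate in rates.items():
    print(f"{pressure:>3} GPa: {100 * rate:.0f}%")
print(f"overall: {100 * overall:.0f}%")
```

Under these assumptions the rounded values agree with the 93% (0 GPa), 100% (100 GPa), 92% (300 GPa) and 96% (overall) figures given in the text; the 200 GPa value over all literature structures is not stated there.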
In Fig. 3, we show EA-generated crystal structures for the 21 cases which we believe may be of interest for future studies, labeled with their element, space group number, and pressure. In all these cases, experimental information is either not available at all, or too incomplete to completely determine the structure. We decided, however, not to show trivial cases in which the EA-predicted ground state is a monoatomic bcc, fcc or hcp structure.
Of the structures shown in the figure, hydrogen and oxygen tend to form such strong bonds that they form molecular crystals up to the highest pressure considered in this work. Nitrogen and boron, whose covalent bonds are more prone to frustration, form complex crystalline polymers. Lithium and phosphorus form complex, high-symmetry phases with large unit cells. Heavier elements tend to form less exotic structures, mainly tetragonal distortions of cubic structures. We note that the qualitative behavior is consistent with what is observed in other elements, where high-pressure data is available.

Usage Notes
Data are stored on figshare https://doi.org/10.6084/m9.figshare.c.7119778.v1, in two separate compressed zip archives. The first archive, HEX.zip, contains three folders, one for each of the databases described in the text (GS, EA, MISS). Moreover, each folder contains four sub-folders, one for each pressure. The sub-folders contain files in the standard Crystallographic Information File (cif) format, named as ELEMENT_PRESSURE_DATABASE.cif (DATABASE = GS, EA, MISS). The second archive, Evolutionary.zip, contains the input files used for the evolutionary prediction runs (USPEX input files + example of VASP INCAR files).
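When processing the archive programmatically, the naming scheme ELEMENT_PRESSURE_DATABASE.cif can be split into its three fields. The sketch below is only illustrative: it assumes that ELEMENT is a chemical symbol and that PRESSURE is a bare integer in GPa (e.g. a hypothetical `La_300_GS.cif`), which should be checked against the actual file names in HEX.zip.

```python
import re
from typing import NamedTuple


class HexEntry(NamedTuple):
    element: str
    pressure_gpa: int
    database: str


# Assumed pattern: chemical symbol, integer pressure, database tag.
_PATTERN = re.compile(r"^([A-Z][a-z]?)_(\d+)_(GS|EA|MISS)\.cif$")


def parse_hex_filename(name: str) -> HexEntry:
    """Split a file name of the form ELEMENT_PRESSURE_DATABASE.cif."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name}")
    element, pressure, database = match.groups()
    return HexEntry(element, int(pressure), database)
```

If the real files encode the pressure differently (for example with a unit suffix), only the regular expression needs to be adapted.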

Fig. 2 (a) The bar chart gives a breakdown of the whole dataset for different pressures into structures (i) Unreported (green) and (ii) reported in literature, divided into the Experiment and Theory categories. The color of the bar indicates whether EA-predicted structures exactly match or are degenerate with available literature data: successful (blue)/unsuccessful (yellow). (b) Pie chart displaying the fraction of successful EA predictions, considering only fully characterized experimental structures. Red (green) represents the successful (unsuccessful) cases for all pressures.

Table 2. Ground-state database (DB_GS) at 0 GPa. The symbols in the Source column indicate: (i) * the structure is matching; (ii) - no reference could be found in literature; (iii) lit./ea. the ground-state structure originates from literature/evolutionary algorithm. In the Ref column, (i) th. (exp.)

Table 4. Ground-state database (DB_GS) at 200 GPa. The symbols in the Source column indicate: (i) * the structure is matching; (ii) - no reference could be found in literature; (iii) lit./ea. the ground-state structure originates from literature/evolutionary algorithm. In the Ref column, (i) th. (exp.) specifies whether the literature source is computational (experimental), or (ii) - missing.

Table 5. Ground-state database (DB_GS) at 300 GPa. The symbols in the Source column indicate: (i) * the structure is matching; (ii) - no reference could be found in literature; (iii) lit./ea. the ground-state structure originates from literature/evolutionary algorithm. In the Ref column, (i) th. (exp.) specifies whether the literature source is computational (experimental), or (ii) - missing.

Table 6. Evolutionary algorithm database (DB_EA) at 0 GPa. In the ΔH EA-GS column, (i) 0 indicates that the EA crystal structure is lower in enthalpy than the corresponding LIT crystal structure; (ii) <26 that the EA crystal structure is degenerate in enthalpy with the corresponding LIT crystal structure; (iii) otherwise, it indicates the difference in enthalpy between the ground-state crystal structure and the EA-generated crystal structure, in meV/atom. In the Ref column, (i) th. (exp.) indicates that the literature source is computational (experimental), or (ii) - missing; (iii) Fe*, Co* and Ni* indicate that the calculation is spin-polarized, with a magnetic moment of 2.22, 1.74, 0.606 a.u. respectively 74 and (iv) Br † and I † , that the calculation includes vdW interactions through the optB88-vdW exchange-correlation functional 75 .

Table 7. Evolutionary algorithm database (DB_EA) at 100 GPa. In the ΔH EA-GS column, (i) 0 indicates that the EA crystal structure is lower in enthalpy than the corresponding LIT crystal structure; (ii) <26 that the EA crystal structure is degenerate in enthalpy with the corresponding LIT crystal structure; (iii) otherwise, it indicates the difference in enthalpy between the ground-state crystal structure and the EA-generated crystal structure, in meV/atom. In the Ref column, (i) th. (exp.) indicates that the literature source is computational (experimental), or (ii) - missing.

In the following, structures for which literature and EA results are the same are named matching, while those that differ are named mismatching. The column Ref reports the literature reference.
• DB_EA (Database Evolutionary Algorithm): This database contains the results of our evolutionary algorithm searches for every combination of element and pressure considered. The main results are summarized in Tables 6-9. The columns contain the atomic number Z, element symbol, unit cell volume (per atom), the Wyckoff positions, and the enthalpy relative to that of the ground-state structure. The relative enthalpy ΔH EA-GS is zero in cases where the EA predicts the lowest-enthalpy structure, < 26 when the difference between EA and LIT ground-state structure is smaller than 26 meV/atom, and positive otherwise. The column Ref reports the literature reference.

Table 8. Evolutionary algorithm database (DB_EA) at 200 GPa. In the ΔH EA-GS column, (i) 0 indicates that the EA crystal structure is lower in enthalpy than the corresponding LIT crystal structure; (ii) <26 that the EA crystal structure is degenerate in enthalpy with the corresponding LIT crystal structure; (iii) otherwise, it indicates the difference in enthalpy between the ground-state crystal structure and the EA-generated crystal structure, in meV/atom. In the Ref column, (i) th. (exp.) indicates that the literature source is computational (experimental), or (ii) - missing.

Table 9. Evolutionary algorithm database (DB_EA) at 300 GPa. In the ΔH EA-GS column, (i) 0 indicates that the EA crystal structure is lower in enthalpy than the corresponding LIT crystal structure; (ii) <26 that the EA crystal structure is degenerate in enthalpy with the corresponding LIT crystal structure; (iii) otherwise, it indicates the difference in enthalpy between the ground-state crystal structure and the EA-generated crystal structure, in meV/atom. In the Ref column, (i) th. (exp.) indicates that the literature source is computational (experimental), or (ii) - missing.

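Table 10. Database of structures (DB_MISS) at different pressures. The ΔH EA-GS column represents the difference in enthalpy between the ground-state crystal structure and the LIT crystal structure, in meV/atom. In the Ref column, th. (exp.) specifies whether the literature source is computational (experimental). N/A indicates that the available information is too incomplete to completely characterize the structure (unreproducible).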